Aligning WordNet Synsets and Wikipedia Articles

Authors

  • Samuel Fernando
  • Mark Stevenson
Abstract

This paper examines the problem of finding articles in Wikipedia to match noun synsets in WordNet. The motivation is that these articles enrich the synsets with much more information than is already present in WordNet. Two methods are used. The first is title matching, following redirects and disambiguation links. The second is information retrieval over the set of articles. The methods are evaluated over a random sample set of 200 noun synsets which were manually annotated. With 10 candidate articles retrieved for each noun synset, the methods achieve recall of 93%. The manually annotated data set and the automatically generated candidate article sets are available online for research purposes.
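The abstract does not give implementation details, so the following is only a minimal sketch of how title matching with redirects and disambiguation links, and a recall measure over the top candidates, might look. The toy data (`ARTICLES`, `REDIRECTS`, `DISAMBIGUATION`), the function names, and the exact recall definition are all assumptions made for illustration, not the authors' code.

```python
from typing import Dict, List, Set

# Toy stand-ins for Wikipedia data; every title and mapping here is hypothetical.
ARTICLES: Set[str] = {"Bank", "Bank (geography)", "Financial institution"}
REDIRECTS: Dict[str, str] = {"Banking house": "Bank"}
DISAMBIGUATION: Dict[str, List[str]] = {
    "Bank (disambiguation)": ["Bank", "Bank (geography)"],
}


def normalise(lemma: str) -> str:
    """Turn a WordNet-style lemma such as 'banking_house' into a title-like string."""
    text = lemma.replace("_", " ").strip()
    return text[:1].upper() + text[1:]


def title_candidates(lemmas: List[str]) -> List[str]:
    """Collect candidate articles for one synset by title matching."""
    found: List[str] = []

    def add(title: str) -> None:
        # Keep only real articles, without duplicates, in discovery order.
        if title in ARTICLES and title not in found:
            found.append(title)

    for lemma in lemmas:
        title = normalise(lemma)
        add(title)                       # direct title match
        if title in REDIRECTS:
            add(REDIRECTS[title])        # follow a redirect
        for target in DISAMBIGUATION.get(f"{title} (disambiguation)", []):
            add(target)                  # follow disambiguation links
    return found


def recall_at_k(candidates: Dict[str, List[str]],
                gold: Dict[str, str],
                k: int = 10) -> float:
    """Fraction of annotated synsets whose gold article is among the top-k candidates.

    This definition of recall is an assumption; the paper may compute it differently.
    """
    if not gold:
        return 0.0
    hits = sum(1 for synset, article in gold.items()
               if article in candidates.get(synset, [])[:k])
    return hits / len(gold)


if __name__ == "__main__":
    cands = {"s1": title_candidates(["bank", "banking_house"])}
    print(cands["s1"])
    print(recall_at_k(cands, {"s1": "Bank (geography)"}))
```

In a fuller pipeline, the information retrieval method mentioned in the abstract would add further candidates to the same list before recall is measured.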


Similar resources

Aligning Sense Inventories in Wikipedia and WordNet

In this paper, we study the alignment of Wikipedia articles and WordNet synsets. Therefore, we propose a method to convert Wikipedia to a sense inventory. We show that an aligned sense inventory of both resources has two major benefits: the coverage of senses can be increased and enhanced information about aligned senses can be obtained. Our study and conclusions are based on human annotations ...


Mapping WordNet synsets to Wikipedia articles

Lexical knowledge bases (LKBs), such as WordNet, have been shown to be useful for a range of language processing tasks. Extending these resources is an expensive and time-consuming process. This paper describes an approach to address this problem by automatically generating a mapping from WordNet synsets to Wikipedia articles. A sample of synsets has been manually annotated with article matches...


Constructing a class hierarchy with properties by refining and aligning Japanese Wikipedia Ontology and Japanese WordNet

We have proposed learning methods for building a large-scale, high-accuracy general ontology called Japanese Wikipedia Ontology (JWO) by extracting concepts and the relationships between them from various semi-structured resources in Japanese Wikipedia [3]. However, JWO lacks upper classes and appropriate definitions of properties. Thus, the aim of our...


Learning the semantics of Wikipedia hyperlinks

I claim that hyperlinks in Wikipedia entries often correspond to semantic relationships between the concepts described by the entries. This bachelor’s thesis discusses supervised methods to automatically identify new links that correspond to a given relation (hypernymy or hyponymy). Training data is collected by mapping Wikipedia articles to WordNet synsets and then marking links where a relation bet...


The People's Web meets Linguistic Knowledge: Automatic Sense Alignment of Wikipedia and WordNet

We propose a method to automatically align WordNet synsets and Wikipedia articles to obtain a sense inventory of higher coverage and quality. For each WordNet synset, we first extract a set of Wikipedia articles as alignment candidates; in a second step, we determine which article (if any) is a valid alignment, i.e. is about the same sense or concept. In this paper, we go significantly beyond stat...




Publication date: 2010